Search | Portal Regional da BVS

Improving Bayesian credibility intervals for classifier error rates using maximum entropy empirical priors.

Gustafsson, Mats G; Wallman, Mikael; Wickenberg Bolin, Ulrika; Göransson, Hanna; Fryknäs, M; Andersson, Claes R; Isaksson, Anders.

Artif Intell Med ; 49(2): 93-104, 2010 Jun.

Article in English | MEDLINE | ID: mdl-20347582

ABSTRACT

OBJECTIVE: Successful use of classifiers that learn to make decisions from a set of patient examples requires robust methods for performance estimation. Recently, many promising approaches for determining an upper bound for the error rate of a single classifier have been reported, but the Bayesian credibility interval (CI) obtained from a conventional holdout test still delivers one of the tightest bounds. The conventional Bayesian CI becomes unacceptably large in real-world applications where the test set sizes are less than a few hundred. The source of this problem is the fact that the CI is determined exclusively by the result on the test examples. In other words, no information at all is provided by the uniform prior density distribution employed, which reflects a complete lack of prior knowledge about the unknown error rate. Therefore, the aim of the study reported here was to investigate a maximum entropy (ME) based approach to improved prior knowledge and Bayesian CIs, demonstrating its relevance for biomedical research and clinical practice. METHOD AND MATERIAL: It is demonstrated how a refined non-uniform prior density distribution can be obtained by means of the ME principle using empirical results from a few designs and tests using non-overlapping sets of examples. RESULTS: Experimental results show that ME-based priors improve the CIs when applied to four quite different simulated and two real-world data sets. CONCLUSIONS: An empirically derived ME prior seems promising for improving the Bayesian CI for the unknown error rate of a designed classifier.
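As a rough illustration of the conventional baseline described in this abstract, the sketch below computes an equal-tailed Bayesian credibility interval for an error rate under a uniform prior, where the posterior after observing `errors` mistakes in `n_test` holdout examples is Beta(errors + 1, n_test - errors + 1). The function names and the choice of an equal-tailed interval are assumptions made for this example; it does not implement the maximum entropy prior proposed by the authors.

```python
from math import comb

def posterior_cdf(x, errors, n_test):
    """CDF at x of the Beta(errors + 1, n_test - errors + 1) posterior.

    For integer parameters the regularized incomplete beta function is a
    binomial tail sum: I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    Intended for test sets of up to a few hundred examples.
    """
    a = errors + 1
    m = n_test + 1  # equals a + b - 1
    return sum(comb(m, j) * x**j * (1 - x)**(m - j) for j in range(a, m + 1))

def posterior_quantile(q, errors, n_test, tol=1e-10):
    """Invert the posterior CDF by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if posterior_cdf(mid, errors, n_test) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def credibility_interval(errors, n_test, level=0.95):
    """Equal-tailed credibility interval for the unknown error rate."""
    tail = (1 - level) / 2
    return (posterior_quantile(tail, errors, n_test),
            posterior_quantile(1 - tail, errors, n_test))

# Same observed error rate (10%), two different test set sizes.
small = credibility_interval(5, 50)
large = credibility_interval(50, 500)
print("n_test=50: ", small)
print("n_test=500:", large)
```

With the same observed error rate, the interval from the smaller test set is noticeably wider, which is the small-sample problem the abstract refers to.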

Subjects

Artificial Intelligence , Bayes Theorem , Data Mining , Databases as Topic , Clinical Decision Support Systems , Statistical Models , Algorithms , Breast Neoplasms/classification , Breast Neoplasms/diagnosis , Computer Simulation , Decision Trees , Empirical Research , Female , Fungal Proteins/classification , Fungal Proteins/physiology , Humans , Linear Models , Normal Distribution , Predictive Value of Tests , Prognosis , Reproducibility of Results , Controlled Vocabulary

Gene expression analysis identifies a genetic signature potentially associated with response to alpha-IFN in chronic phase CML patients.

Hagberg, Anette; Olsson-Strömberg, Ulla; Wickenberg-Bolin, Ulrika; Göransson, Hanna; Isaksson, Anders; Bengtsson, Mats; Höglund, Martin; Simonsson, Bengt; Barbany, Gisela.

Leuk Res ; 31(7): 931-8, 2007 Jul.

Article in English | MEDLINE | ID: mdl-17207527

ABSTRACT

Microarray-based gene expression analysis was performed on diagnostic chronic phase CML patient samples prior to interferon treatment. Fifteen patient samples, corresponding to six cytogenetic responders and nine non-responders, were included. Genes differentially expressed between responder and non-responder patients were listed, and a subsequent leave-one-out cross-validation (LOOV) procedure showed that the top 20 genes allowed the highest prediction accuracy. The relevant genes were quantified by real-time PCR, which supported the microarray results. We conclude that it might be possible to use gene expression analysis to predict future response to interferon in CML diagnostic samples.
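The leave-one-out procedure mentioned in this abstract can be sketched generically as follows. Everything here is an assumption made for illustration: the nearest-centroid classifier, the function names, and the made-up two-feature data with six "R" and nine "NR" samples. The abstract does not state which classifier the authors used, and the values below are not study data.

```python
def nearest_centroid_fit(X, y):
    """Compute one mean vector (centroid) per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = nearest_centroid_fit(X_train, y_train)
        correct += nearest_centroid_predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical, well-separated toy data: 6 "R" and 9 "NR" samples.
X = [[2.1, 1.9], [1.8, 2.2], [2.3, 2.0], [1.9, 1.7], [2.0, 2.4], [2.2, 2.1],
     [0.1, -0.2], [-0.3, 0.2], [0.2, 0.3], [0.0, 0.1], [-0.1, -0.1],
     [0.3, 0.0], [-0.2, 0.4], [0.1, 0.2], [0.4, -0.3]]
y = ["R"] * 6 + ["NR"] * 9
print(leave_one_out_accuracy(X, y))
```

Each of the 15 samples is predicted by a model that never saw it, so the reported accuracy is an average over 15 single-sample tests.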

Subjects

Antineoplastic Agents/therapeutic use , Tumor Biomarkers/genetics , Leukemic Gene Expression Regulation , Neoplastic Gene Expression Regulation/drug effects , Interferon-alpha/therapeutic use , Chronic-Phase Myeloid Leukemia/drug therapy , Chronic-Phase Myeloid Leukemia/genetics , Adult , Aged , Tumor Biomarkers/metabolism , Cytarabine/therapeutic use , bcr-abl Fusion Proteins/genetics , Gene Expression Profiling , Humans , Hydroxyurea/therapeutic use , Chronic-Phase Myeloid Leukemia/metabolism , Middle Aged , Oligonucleotide Array Sequence Analysis , Messenger RNA/genetics , Messenger RNA/metabolism , Neoplasm RNA/blood , Neoplasm RNA/genetics , Neoplasm RNA/isolation & purification , Reverse Transcriptase Polymerase Chain Reaction

Molecular markers for discrimination of benign and malignant follicular thyroid tumors.

Fryknäs, Mårten; Wickenberg-Bolin, Ulrika; Göransson, Hanna; Gustafsson, Mats G; Foukakis, Theodoros; Lee, Jia-Jing; Landegren, Ulf; Höög, Anders; Larsson, Catharina; Grimelius, Lars; Wallin, Göran; Pettersson, Ulf; Isaksson, Anders.

Tumour Biol ; 27(4): 211-20, 2006.

Article in English | MEDLINE | ID: mdl-16675914

ABSTRACT

OBJECTIVE: To identify molecular markers useful for the diagnostic discrimination of benign and malignant follicular thyroid tumors. METHODS: A panel of thyroid tumors was characterized with expression profiling using cDNA microarrays. A robust algorithm for gene selection was developed to identify molecular markers useful for the classification of heterogeneous tumor classes. The study included tumor tissue specimens from 10 patients with benign follicular adenomas and from 10 with malignant tumors. The malignant tumors mainly consisted of clinically relevant minimally invasive follicular carcinomas. The mRNA expression level of a candidate gene, FHL1, was evaluated in an independent series of 61 tumors. RESULTS: Twenty-two gene expression markers were identified as differentially expressed. Several of the identified genes, for example DIO1, CITED1, CA12 and FN1, have previously been observed as differentially expressed in various thyroid tumors. FHL1 was significantly underexpressed in carcinomas compared to adenomas in the independent panel of tumors. The results indicate that a small number of genes can be useful to distinguish follicular adenomas from follicular carcinomas. CONCLUSIONS: Our findings clearly corroborate previous studies and identify novel candidate molecular markers. These genes have the potential for molecular classification of follicular thyroid tumors and for providing improved understanding of the molecular mechanisms involved in thyroid malignancies.

Subjects

Adenoma/genetics , Genetic Markers , Thyroid Diseases/diagnosis , Thyroid Neoplasms/genetics , Follicular Adenocarcinoma/diagnosis , Follicular Adenocarcinoma/genetics , Follicular Adenocarcinoma/pathology , Adenoma/diagnosis , Adult , Aged , DNA Primers , Differential Diagnosis , Female , Humans , Male , Middle Aged , Neoplasm Invasiveness , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Thyroid Neoplasms/diagnosis , Thyroid Neoplasms/pathology

Improved variance estimation of classification performance via reduction of bias caused by small sample size.

Wickenberg-Bolin, Ulrika; Göransson, Hanna; Fryknäs, Mårten; Gustafsson, Mats G; Isaksson, Anders.

BMC Bioinformatics ; 7: 127, 2006 Mar 13.

Article in English | MEDLINE | ID: mdl-16533392

ABSTRACT

BACKGROUND: Supervised learning for classification of cancer employs a set of design examples to learn how to discriminate between tumors. In practice it is crucial to confirm that the classifier is robust with good generalization performance to new examples, or at least that it performs better than random guessing. A suggested alternative is to obtain a confidence interval of the error rate using repeated design and test sets selected from available examples. However, it is known that even in the ideal situation of repeated designs and tests with completely novel samples in each cycle, a small test set size leads to a large bias in the estimate of the true variance between design sets. Therefore, different methods for small-sample performance estimation, such as a recently proposed procedure called Repeated Random Sampling (RSS), are also expected to result in heavily biased estimates, which in turn translates into biased confidence intervals. Here we explore such biases and develop a refined algorithm called Repeated Independent Design and Test (RIDT). RESULTS: Our simulations reveal that repeated designs and tests based on resampling in a fixed bag of samples yield a biased variance estimate. We also demonstrate that it is possible to obtain an improved variance estimate by means of a procedure that explicitly models how this bias depends on the number of samples used for testing. For the special case of repeated designs and tests using new samples for each design and test, we present an exact analytical expression for how the expected value of the bias decreases with the size of the test set. CONCLUSION: We show that via modeling and subsequent reduction of the small-sample bias, it is possible to obtain an improved estimate of the variance of classifier performance between design sets. However, the uncertainty of the variance estimate is large in the simulations performed, indicating that the method in its present form cannot be directly applied to small data sets.
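To make the small-test-set bias concrete, the sketch below uses the law of total variance for the idealized case of fresh samples in every design and test: if a design has true error rate e and is tested on n_test new examples, the variance of the observed error rates equals the true variance between designs plus E[e(1 - e)] / n_test. This is a standard textbook decomposition shown for illustration; it is not claimed to be the exact expression derived in the paper. The function names and the three example error rates are assumptions.

```python
from math import comb

def variance_of_observed_error(true_rates, n_test):
    """Exact variance of k / n_test when a design is drawn uniformly from
    true_rates and k ~ Binomial(n_test, e) for that design's error rate e."""
    m = len(true_rates)
    mean = 0.0
    second_moment = 0.0
    for e in true_rates:
        for k in range(n_test + 1):
            prob = comb(n_test, k) * e**k * (1 - e)**(n_test - k) / m
            estimate = k / n_test
            mean += prob * estimate
            second_moment += prob * estimate**2
    return second_moment - mean**2

def variance_between_designs(true_rates):
    """Variance of the true error rates across designs (the target quantity)."""
    mu = sum(true_rates) / len(true_rates)
    return sum((e - mu) ** 2 for e in true_rates) / len(true_rates)

def small_sample_bias(true_rates, n_test):
    """Extra variance from finite test sets: E[e(1 - e)] / n_test."""
    return sum(e * (1 - e) for e in true_rates) / (len(true_rates) * n_test)

rates = [0.1, 0.2, 0.3]  # hypothetical true error rates of three designs
for n_test in (10, 100, 1000):
    observed = variance_of_observed_error(rates, n_test)
    corrected = observed - small_sample_bias(rates, n_test)
    print(n_test, observed, corrected, variance_between_designs(rates))
```

In this idealized setting the bias shrinks in proportion to 1 / n_test, so subtracting it recovers the variance between designs; with resampling from a fixed bag of samples, as the abstract notes, the situation is more complicated.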

Subjects

Artificial Intelligence , Tumor Biomarkers/analysis , Computer-Assisted Diagnosis/methods , Gene Expression Profiling/methods , Neoplasm Proteins/analysis , Neoplasms/diagnosis , Analysis of Variance , Bias , Humans , Biological Models , Statistical Models , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis/methods , Automated Pattern Recognition/methods , Reproducibility of Results , Sample Size , Sensitivity and Specificity
